PATENT 

C. AMENDMENTS TO THE CLAIMS 

1. (Currently Amended) A computer-implemented method of
analyzing a data source that includes a plurality of
household records, said method comprising:
determining that the data source is not a subset of a 
reference file; 

in response to determining that the data source is not a 
subset of the reference file, retrieving, from a 
nonvolatile storage area, a sample quantity of household 
records included in the data source, wherein the sample 
quantity of household records does not include all of the 
plurality of household records included in the data source; 
comparing each household record included in the sample 
quantity of household records to the reference file; 
[[comparing the data source to a reference file;]]
determining, based upon the comparing, whether the data
source is balanced, signifying that the sample quantity of
household records represents the data source; and [[whether
the data source is balanced in response to the comparing;]]

adjusting, based upon determining whether the data source
is balanced, the sample quantity of household records such
that the adjusted sample quantity of household records is
balanced and represents the data source[[, the data source
based on the determining, wherein the adjusting results in
a more balanced data source]].

2. (Currently Amended) The method as described in claim 1
further comprising:

matching one or more of the household records included in
the adjusted sample quantity of household records [[records
from the data source]] to one or more reference file records;
generating a comparison master file based on the matching; 
and 

assigning an index number to each record in the comparison 
master file. 

3. (Original) The method as described in claim 1 further
comprising:

retrieving a rule corresponding to an element in the data
source;

determining whether the element in the data source
approximates a corresponding value in the reference file
based on the retrieved rule; and

assigning a match to the element in response to the
determination.

4. (Currently Amended) The method as described in claim 1 
further comprising: 

calculating a first bias value based upon matching one or
more of the household records included in the adjusted
sample quantity of household records [[records from the data
source]] to one or more reference file records[[; and
calculating a first bias value based upon the matching]].

5. (Currently Amended) The method as described in claim 4 
further comprising: 

[[matching one or more records from a second data source to
one or more reference file records;]]
calculating a second bias value based upon matching one or
more household records from a second data source to one or
more reference file records; and [[the matching; and]]
wherein the selecting includes comparing the first bias
value to the second bias value.

6. (Currently Amended) The method as described in claim 1 
further comprising: 

identifying a first data source sample size; 
comparing a first data source sample corresponding to the 
first data source sample size to the reference file; 
determining a match percentage based on comparing each of
the household records included in the adjusted sample
quantity of household records to the reference file; [[the
comparing;]] and

calculating a new source file sample quantity [[second data
source sample size]] by dividing the adjusted number of
household records [[first data source sample size]] by the
match percentage.

7. (Currently Amended) The method as described in claim 6 
further comprising: 

identifying a second data source corresponding to the new
source file sample quantity; and [[second data source sample
size;]]

[[matching one or more records from the second data source to
one or more reference file records; and]]

calculating a second match percentage based on matching one
or more records from the new source file sample quantity to
one or more reference file records [[the matching]].

8. (Currently Amended) An information handling system
comprising:

one or more processors; 

a memory accessible by the processors; 

one or more nonvolatile storage devices accessible by the
processors;

a data source handling tool to manage a data source stored 
on one of the nonvolatile storage devices, the data source 
handling tool including: 

means for determining that the data source is not a
subset of a reference file;

in response to determining that the data source is not 
a subset of the reference file, means for retrieving, 
from one of the nonvolatile storage devices, a sample 
quantity of household records included in the data 
source, wherein the sample quantity of household 
records does not include all of the plurality of 
household records included in the data source; 
means for comparing each household record included in 
the sample quantity of household records to the 
reference file; 

[[means for comparing the data source to a reference
file stored on one of the nonvolatile storage devices;]]
means for determining, based upon the comparing,
whether the data source is balanced, signifying that
the sample quantity of household records represents
the data source; and [[whether the data source is
balanced in response to the comparing; and]]
means for adjusting, based upon determining whether
the data source is balanced, the sample quantity of
household records such that the adjusted sample



Docket No. AUS920010646US1 Page 8 of 22 

Cary, et. al. - 09/942,635 



Atty Ref. No. IBM-1036 






quantity of household records is balanced and
represents the data source[[, the data source based on
the determining, wherein the adjusting results in a
more balanced data source]].

9. (Currently Amended) The information handling system as 
described in claim 8 further comprising: 

means for matching one or more of the household records
included in the adjusted sample quantity of household
records [[records from the data source]] to one or more
reference file records;

means for generating a comparison master file based on the 
matching; and 

means for assigning an index number to each record in the 
comparison master file. 

10. (Original) The information handling system as described in 
claim 8 further comprising: 

means for retrieving a rule corresponding to an element in
the data source from one of the nonvolatile storage
devices;

means for determining whether the element in the data 
source approximates a corresponding value in the reference 
file based on the retrieved rule; and 

means for assigning a match to the element in response to 
the determination. 

11. (Currently Amended) The information handling system as 
described in claim 8 further comprising: 

means for calculating a first bias value based upon
matching one or more of the household records included in
the adjusted sample quantity of household records.

[[means for matching one or more records from the data source
to one or more reference file records; and
means for calculating a first bias value based upon the
matching.]]

12. (Currently Amended) The information handling system as
described in claim [[8]] 11 further comprising:

means for calculating a second bias value based upon
matching one or more household records from a second data
source to one or more reference file records; and
wherein the selecting includes comparing the first bias
value to the second bias value.

[[means for matching one or more records from a second data
source to one or more reference file records;
means for calculating a second bias value based upon the
matching; and

means for comparing the first bias value to the second bias
value.]]

13. (Currently Amended) The information handling system as 
described in claim 8 further comprising: 

means for identifying a first data source sample size; 
means for comparing a first data source sample 
corresponding to the first data source sample size to the 
reference file; 

means for determining a match percentage based on comparing
each of the household records included in the adjusted
sample quantity of household records to the reference file;
and [[the comparing; and]]

means for calculating a new source file sample quantity
[[second data source sample size]] by dividing the adjusted
number of household records [[first data source sample size]]
by the match percentage.

14. (Currently Amended) The information handling system as 
described in claim 13 further comprising: 

means for identifying a second data source corresponding to
the new source file sample quantity; and [[second data source
sample size;]]

[[means for matching one or more records from the second data
source to one or more reference file records; and]]
means for calculating a second match percentage based on
matching one or more records from the new source file
sample quantity to one or more reference file records [[the
matching]].

15. (Currently Amended) A computer program product stored on a
computer operable media, the computer operable media
containing instructions for execution by a computer, which,
when executed by the computer, cause the computer to
implement a method for selecting a data source vendor, the
method comprising: [[in a computer operable media for
managing a data source, said computer program product
comprising:]]

determining that the data source is not a subset of a 
reference file; 

in response to determining that the data source is not a 
subset of the reference file, retrieving a sample quantity 
of household records included in the data source, wherein 
the sample quantity of household records does not include 
all of the plurality of household records included in the
data source;

comparing each household record included in the sample
quantity of household records to the reference file;
[[means for comparing the data source to a reference file;]]
[[means for]] determining, based upon the comparing, whether
the data source is balanced, signifying that the sample
quantity of household records represents the data source;
and [[whether the data source is balanced in response to the
comparing; and]]

[[means for]] adjusting, based upon determining whether the
data source is balanced, the sample quantity of household
records such that the adjusted sample quantity of household
records is balanced and represents the data source.
[[data source based on the determining, wherein the adjusting
results in a more balanced data source.]]

16. (Currently Amended) The computer program product described
in claim 15 wherein the method further comprises: [[further
comprising:]]

[[means for]] matching one or more of the household records
included in the adjusted sample quantity of household
records [[records from the data source]] to one or more
reference file records;

[[means for]] generating a comparison master file based on the
matching; and

[[means for]] assigning an index number to each record in the
comparison master file.

17. (Currently Amended) The computer program product described
in claim 15 wherein the method further comprises: [[further
comprising:]]

[[means for]] retrieving a rule corresponding to an element in
the data source from the nonvolatile storage area;

[[means for]] determining whether the element in the data
source approximates a corresponding value in the reference
file based on the retrieved rule; and

[[means for]] assigning a match to the element in response to
the determination.

18. (Currently Amended) The computer program product described
in claim 15 wherein the method further comprises: [[further
comprising:]]

calculating a first bias value based upon matching one or
more of the household records included in the adjusted
sample quantity of household records.

[[means for matching one or more records from the data source
to one or more reference file records; and
means for calculating a first bias value based upon the
matching.]]

19. (Currently Amended) The computer program product described
in claim 18 wherein the method further comprises:
[[further comprising:]]

calculating a second bias value based upon matching one or
more household records from a second data source to one or
more reference file records; and

wherein the selecting includes comparing the first bias
value to the second bias value.

[[means for matching one or more records from a second data
source to one or more reference file records;
means for calculating a second bias value based upon the
matching; and

means for comparing the first bias value to the second bias
value.]]

20. (Currently Amended) The computer program product described
in claim 15 wherein the method further comprises: [[further
comprising:]]

[[means for]] identifying a first data source sample size;
[[means for]] comparing a first data source sample
corresponding to the first data source sample size to the
reference file;

[[means for]] determining a match percentage based on comparing
each of the household records included in the adjusted
sample quantity of household records to the reference file;
and [[the comparing; and]]

[[means for]] calculating a new source file sample quantity
[[second data source sample size]] by dividing the adjusted
number of household records [[first data source sample size]]
by the match percentage.